Space-Efficient Dictionaries for Parameterized and Order-Preserving Pattern Matching
Authors
Abstract
Let S and S′ be two strings of the same length over a totally ordered alphabet. We consider the following two variants of string matching.

Parameterized Matching: The characters of S and S′ are partitioned into static characters and parameterized characters. The strings are a parameterized match iff the static characters match exactly and there exists a one-to-one function that renames the parameterized characters in S to those in S′.

Order-Preserving Matching: The strings are an order-preserving match iff for any two integers i, j ∈ [1, |S|], S[i] ≺ S[j] ⇐⇒ S′[i] ≺ S′[j], where ≺ denotes the precedence order of the alphabet.

Let P be a collection of d patterns {P1, P2, ..., Pd} of total length n characters, chosen from a totally ordered alphabet Σ. Given a text T, also over Σ, we consider the dictionary indexing problem under the above definitions of string matching. Specifically, the task is to index P such that we can report all positions j (called occurrences) where at least one of the patterns Pi ∈ P is a parameterized match (resp. an order-preserving match) with the same-length substring of T starting at j. Previous best-known indexes occupy O(n log n) bits and can report all occ occurrences in O(|T| log |Σ| + occ) time. We present space-efficient indexes that occupy O(n log |Σ| + d log n) bits and report all occ occurrences in O(|T|(log |Σ| + log_{|Σ|} n) + occ) time for parameterized matching, and in O(|T| log n + occ) time for order-preserving matching.

1998 ACM Subject Classification: F.2.2 Pattern Matching

The work of Arnab Ganguly was supported by National Science Foundation Grants CCF–1017623 and CCF–1218904. The work of Wing-Kai Hon was supported by National Science Council Grants 102-2221-E-007-068-MY3 and 105-2918-I-007-006.

© Arnab Ganguly, Wing-Kai Hon, Kunihiko Sadakane, Rahul Shah, Sharma V. Thankachan, and Yilin Yang; licensed under Creative Commons License CC-BY.
27th Annual Symposium on Combinatorial Pattern Matching (CPM 2016). Editors: Roberto Grossi and Moshe Lewenstein; Article No. 2; pp. 2:1–2:12.
Leibniz International Proceedings in Informatics. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, Dagstuhl Publishing, Germany.
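The two matching definitions in the abstract can be checked directly from their statements. The following is a minimal sketch of those definitions and of naive dictionary reporting, not the space-efficient index the paper presents. The function names, the `static` set argument, and the 0-based positions are assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence, Set


def parameterized_match(s: str, t: str, static: Set[str]) -> bool:
    """Static characters must be equal; parameterized characters must be
    related by a one-to-one renaming from s to t."""
    if len(s) != len(t):
        return False
    forward: Dict[str, str] = {}   # renaming of s's characters to t's
    backward: Dict[str, str] = {}  # inverse map, enforces one-to-one
    for a, b in zip(s, t):
        if a in static or b in static:
            if a != b:
                return False
            continue
        if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
            return False
    return True


def order_preserving_match(s: Sequence, t: Sequence) -> bool:
    """s[i] < s[j] iff t[i] < t[j] for every pair (i, j); quadratic check."""
    if len(s) != len(t):
        return False
    n = len(s)
    return all((s[i] < s[j]) == (t[i] < t[j])
               for i in range(n) for j in range(n))


def report_occurrences(patterns: Sequence, text: Sequence,
                       match: Callable) -> List[int]:
    """Naive dictionary matching: every 0-based position j at which some
    pattern matches the same-length substring of text starting at j."""
    hits = []
    for j in range(len(text)):
        for p in patterns:
            if len(p) <= len(text) - j and match(p, text[j:j + len(p)]):
                hits.append(j)
                break  # report each position once
    return hits
```

For example, with the hypothetical static set {"a", "b"}, `parameterized_match("axbyx", "aybxy", {"a", "b"})` holds because x and y are swapped consistently, while `parameterized_match("xy", "zz", set())` fails because the renaming would not be one-to-one. The naive scan costs time proportional to the text length times the total pattern length, which is the cost that an index over P is meant to avoid.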
Similar resources
A W[1]-Completeness Result for Generalized Permutation Pattern Matching
The NP-complete Permutation Pattern Matching problem asks whether a permutation P (the pattern) can be matched into a permutation T (the text). A matching is an order-preserving embedding of P into T. In the Generalized Permutation Pattern Matching problem one can additionally enforce that certain adjacent elements in the pattern must be mapped to adjacent elements in the text. This paper stud...
Order-preserving matching
We introduce a new string matching problem called order-preserving matching on numeric strings where a pattern matches a text if the text contains a substring whose relative orders coincide with those of the pattern. Order-preserving matching is applicable to many scenarios such as stock price analysis and musical melody matching in which the order relations should be matched instead of the str...
Parameterized Pattern Matching - Succinctly
The fields of succinct data structures and compressed text indexing have seen quite a bit of progress over the last 15 years. An important achievement, primarily using techniques based on the Burrows-Wheeler Transform (BWT), was obtaining the full functionality of suffix tree in the optimal number of bits. A crucial property that allows the use of BWT for designing compressed indexes is order-p...
Order-Preserving Pattern Matching with k Mismatches
We study a generalization of the order-preserving pattern matching recently introduced by Kubica et al. (Inf. Process. Lett., 2013) and Kim et al. (submitted to Theor. Comp. Sci.), where instead of looking for an exact copy of the pattern, we only require that the relative order between the elements is the same. In our variant, we additionally allow up to k mismatches between the pattern of leng...
An Encoding for Order-Preserving Matching
Encoding data structures store enough information to answer the queries they are meant to support but not enough to recover their underlying datasets. In this paper we give the first encoding data structure for the challenging problem of order-preserving pattern matching. This problem was introduced only a few years ago but has already attracted significant attention because of its applications...